Overview

Dataset statistics

Number of variables: 19
Number of observations: 110000
Missing cells: 105605
Missing cells (%): 5.1%
Duplicate rows: 11754
Duplicate rows (%): 10.7%
Total size in memory: 15.9 MiB
Average record size in memory: 152.0 B
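
The two percentages in the dataset statistics can be reproduced from the raw counts. The sketch below assumes that "cells" means observations multiplied by variables; the variable names are illustrative and do not come from the report.

```python
# Counts copied from the dataset statistics above.
n_variables = 19
n_observations = 110_000
missing_cells = 105_605
duplicate_rows = 11_754

# Assumption: the total number of cells is rows x columns.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```

Both values round to the percentages shown in the report under that assumption.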

Variable types

NUM: 12
CAT: 7

Reproduction

Analysis started: 2021-03-22 16:38:15.529571
Analysis finished: 2021-03-22 16:39:26.474066
Duration: 1 minute and 10.94 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Dataset has 11754 (10.7%) duplicate rows [Duplicates]
Loan ID has a high cardinality: 88354 distinct values [High cardinality]
Customer ID has a high cardinality: 88354 distinct values [High cardinality]
Credit Score has 21135 (19.2%) missing values [Missing]
Annual Income has 21135 (19.2%) missing values [Missing]
Years in current job has 4649 (4.2%) missing values [Missing]
Months since last delinquent has 58447 (53.1%) missing values [Missing]
Annual Income is highly skewed (γ1 = 44.99481956) [Skewed]
Maximum Open Credit is highly skewed (γ1 = 138.1501683) [Skewed]
Loan ID is uniformly distributed [Uniform]
Customer ID is uniformly distributed [Uniform]
Number of Credit Problems has 94688 (86.1%) zeros [Zeros]
Bankruptcies has 97669 (88.8%) zeros [Zeros]
Tax Liens has 107872 (98.1%) zeros [Zeros]

Variables

Loan ID
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count: 88354
Unique (%): 80.3%
Missing: 0
Missing (%): 0.0%
Memory size: 859.4 KiB

Value                                 Count   Frequency (%)
5242ffbc-c33e-4b68-967f-68582a35b014  2       < 0.1%
88af1b78-bac7-4a81-874f-b25be4d1116d  2       < 0.1%
ba12554d-f81a-4250-bf8a-3f6f19110cee  2       < 0.1%
85622249-9ed4-4174-b4b3-5fd3bbdc2bd7  2       < 0.1%
1cc4bb6a-cd15-43f7-a7c1-e0e10b399359  2       < 0.1%
3484cd61-d9e4-4f54-963f-2792b42adbea  2       < 0.1%
121f4711-bcb9-4e7a-86c6-a4d56743031d  2       < 0.1%
e95c1044-3989-4c3c-b3f0-2c18c9abbd73  2       < 0.1%
e65b73a0-94bf-4119-afcb-38a9ef338ae5  2       < 0.1%
a1105c75-ee33-460f-b0db-40904e380ee4  2       < 0.1%
Other values (88344)                  109980  > 99.9%

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Customer ID
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count: 88354
Unique (%): 80.3%
Missing: 0
Missing (%): 0.0%
Memory size: 859.4 KiB

Value                                 Count   Frequency (%)
7274ca68-e768-4396-a945-1f68685ab21c  2       < 0.1%
05c0a15f-812b-4d86-bcd2-30c2e1b55c0f  2       < 0.1%
12ab2e43-b6dd-42f2-82f7-4bbdfc38e08a  2       < 0.1%
f6c2125f-2310-4027-bcd5-48ae40b8d48f  2       < 0.1%
83375a14-3b11-47e3-adcc-aba68b9aa3a4  2       < 0.1%
f71fa7f1-1b1a-4854-8ce4-9eb38fe50dfe  2       < 0.1%
1e5ea8eb-ceeb-4d0d-a77a-69d29ffaa5f5  2       < 0.1%
b4f98d59-921f-4e94-8017-fef1fb0bec84  2       < 0.1%
ed8ca528-b5a2-4839-85bb-01ed226ed8f8  2       < 0.1%
040a5b08-32b2-40db-a2bd-09a0d85e2c75  2       < 0.1%
Other values (88344)                  109980  > 99.9%

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Loan Status
Categorical

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 859.4 KiB

Value        Count  Frequency (%)
Fully Paid   85054  77.3%
Charged Off  24946  22.7%

Length

Max length: 11
Median length: 10
Mean length: 10.22678182
Min length: 10
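
The mean length reported for Loan Status is consistent with a count-weighted average of the string lengths of its two categories. A minimal sketch of that arithmetic, using only the counts listed above (the dictionary name is illustrative):

```python
# Category counts copied from the Loan Status frequency table.
counts = {"Fully Paid": 85054, "Charged Off": 24946}

total = sum(counts.values())

# Weight each category's string length by how often it occurs.
mean_length = sum(len(value) * count for value, count in counts.items()) / total

print(total)
print(round(mean_length, 8))
```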

Current Loan Amount
Real number (ℝ≥0)

Distinct count: 22502
Unique (%): 20.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11746206.8279
Minimum: 10802
Maximum: 99999999
Zeros: 0
Zeros (%): 0.0%
Memory size: 859.4 KiB

Quantile statistics

Minimum: 10802
5-th percentile: 75944
Q1: 179586
median: 312026
Q3: 523930
95-th percentile: 99999999
Maximum: 99999999
Range: 99989197
Interquartile range (IQR): 344344
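
The range and the interquartile range in the quantile statistics are derived from the other rows. The sketch below reproduces them and also illustrates why the 95-th percentile equals the maximum: the value 99999999 occurs in 12617 of the 110000 rows according to this report's value counts, which is more than 5% of the data. Variable names are illustrative.

```python
# Quantile statistics copied from the Current Loan Amount section.
minimum, q1, q3, maximum = 10802, 179586, 523930, 99999999

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1

# Share of rows holding the maximum value (count taken from the value table).
share_at_max = 12617 / 110000

print(value_range)
print(iqr)
# If more than 5% of rows sit at the maximum, the 95-th percentile is the maximum.
print(share_at_max > 0.05)
```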

Descriptive statistics

Standard deviation: 31767162.18
Coefficient of variation (CV): 2.704461333
Kurtosis: 3.847793408
Mean: 11746206.83
Median Absolute Deviation (MAD): 147312
Skewness: 2.418145009
Sum: 1.292082751e+12
Variance: 1.009152593e+15
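
Two of the descriptive statistics can be checked against the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. A small sketch using the rounded figures from this section (so the results agree only up to rounding):

```python
# Rounded values copied from the descriptive statistics above.
std_dev = 31767162.18
mean = 11746206.83

cv = std_dev / mean      # coefficient of variation
variance = std_dev ** 2  # variance is the squared standard deviation

print(f"{cv:.6f}")
print(f"{variance:.6e}")
```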
Histogram with fixed size bins (bins=10)

Common values

Value                 Count  Frequency (%)
99999999              12617  11.5%
223322                31     < 0.1%
223652                30     < 0.1%
216194                30     < 0.1%
223102                29     < 0.1%
216810                28     < 0.1%
108966                28     < 0.1%
218702                27     < 0.1%
222266                27     < 0.1%
222596                27     < 0.1%
Other values (22492)  97126  88.3%

Minimum 5 values

Value  Count  Frequency (%)
10802  1      < 0.1%
11242  1      < 0.1%
15422  2      < 0.1%
19470  1      < 0.1%
21098  1      < 0.1%

Maximum 5 values

Value     Count  Frequency (%)
99999999  12617  11.5%
789250    3      < 0.1%
789184    6      < 0.1%
789096    17     < 0.1%
789030    12     < 0.1%
 

Term
Categorical

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 859.4 KiB

Value       Count  Frequency (%)
Short Term  79503  72.3%
Long Term   30497  27.7%

Length

Max length: 10
Median length: 10
Mean length: 9.722754545
Min length: 9

Credit Score
Real number (ℝ≥0)

MISSING

Distinct count: 326
Unique (%): 0.4%
Missing: 21135
Missing (%): 19.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1076.594643560457
Minimum: 585.0
Maximum: 7510.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 859.4 KiB

Quantile statistics

Minimum: 585
5-th percentile: 662
Q1: 705
median: 724
Q3: 741
95-th percentile: 6700
Maximum: 7510
Range: 6925
Interquartile range (IQR): 36

Descriptive statistics

Standard deviation: 1475.581902
Coefficient of variation (CV): 1.370601193
Kurtosis: 12.96574859
Mean: 1076.594644
Median Absolute Deviation (MAD): 17
Skewness: 3.862453899
Sum: 95671583
Variance: 2177341.951

Histogram with fixed size bins (bins=10)

Common values

Value               Count  Frequency (%)
747                 2022   1.8%
740                 1941   1.8%
746                 1919   1.7%
741                 1907   1.7%
742                 1899   1.7%
739                 1788   1.6%
745                 1769   1.6%
748                 1755   1.6%
743                 1705   1.6%
725                 1701   1.5%
Other values (316)  70459  64.1%
(Missing)           21135  19.2%

Minimum 5 values

Value  Count  Frequency (%)
585    13     < 0.1%
586    8      < 0.1%
587    12     < 0.1%
588    21     < 0.1%
589    6      < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
7510   10     < 0.1%
7500   27     < 0.1%
7490   24     < 0.1%
7480   48     < 0.1%
7470   54     < 0.1%
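
The report does not say which denominator it uses for Unique (%) when a variable has missing values. The arithmetic below, using the Credit Score counts from this section, suggests that the distinct count is divided by the number of non-missing values rather than by all rows; this is an inference from the numbers, not a documented rule. Variable names are illustrative.

```python
# Counts copied from the Credit Score summary.
n_rows = 110_000
n_missing = 21_135
n_distinct = 326

n_present = n_rows - n_missing

missing_pct = 100 * n_missing / n_rows
unique_pct_present = 100 * n_distinct / n_present  # denominator: non-missing rows
unique_pct_all = 100 * n_distinct / n_rows         # denominator: all rows

print(f"{missing_pct:.1f}%")
print(f"{unique_pct_present:.1f}%")
print(f"{unique_pct_all:.1f}%")
```

Only the non-missing denominator reproduces the 0.4% shown in the report; dividing by all rows gives 0.3%.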
 

Annual Income
Real number (ℝ≥0)

MISSING
SKEWED

Distinct count: 37853
Unique (%): 42.6%
Missing: 21135
Missing (%): 19.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1377449.0304394306
Minimum: 76627.0
Maximum: 165557393.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 859.4 KiB

Quantile statistics

Minimum: 76627
5-th percentile: 519859
Q1: 848844
median: 1173459
Q3: 1651670
95-th percentile: 2808675
Maximum: 165557393
Range: 165480766
Interquartile range (IQR): 802826

Descriptive statistics

Standard deviation: 1063919.586
Coefficient of variation (CV): 0.7723839955
Kurtosis: 6433.206718
Mean: 1377449.03
Median Absolute Deviation (MAD): 381197
Skewness: 44.99481956
Sum: 1.224070081e+11
Variance: 1.131924885e+12

Histogram with fixed size bins (bins=10)

Common values

Value                 Count  Frequency (%)
1162572               25     < 0.1%
969475                20     < 0.1%
973370                20     < 0.1%
1137492               18     < 0.1%
953230                18     < 0.1%
1140000               18     < 0.1%
1143762               18     < 0.1%
965675                17     < 0.1%
1146612               17     < 0.1%
1112640               17     < 0.1%
Other values (37843)  88677  80.6%
(Missing)             21135  19.2%

Minimum 5 values

Value  Count  Frequency (%)
76627  1      < 0.1%
81092  2      < 0.1%
91485  1      < 0.1%
94867  1      < 0.1%
97033  1      < 0.1%

Maximum 5 values

Value      Count  Frequency (%)
165557393  1      < 0.1%
36475440   1      < 0.1%
30838995   1      < 0.1%
28095300   1      < 0.1%
24161540   1      < 0.1%
 

Years in current job
Categorical

MISSING

Distinct count: 11
Unique (%): < 0.1%
Missing: 4649
Missing (%): 4.2%
Memory size: 859.4 KiB

Summary (top 5)

Value             Count
10+ years         34206
2 years           10050
3 years           9035
< 1 year          8959
5 years           7483
Other values (6)  35618

Value      Count  Frequency (%)
10+ years  34206  31.1%
2 years    10050  9.1%
3 years    9035   8.2%
< 1 year   8959   8.1%
5 years    7483   6.8%
1 year     7108   6.5%
4 years    6756   6.1%
6 years    6252   5.7%
7 years    6131   5.6%
8 years    5054   4.6%
(Missing)  4649   4.2%

Length

Max length: 9
Median length: 7
Mean length: 7.4697
Min length: 3

Home Ownership
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 859.4 KiB

Value          Count  Frequency (%)
Home Mortgage  53277  48.4%
Rent           46397  42.2%
Own Home       10096  9.2%
HaveMortgage   230    0.2%

Length

Max length: 13
Median length: 8
Mean length: 8.742881818
Min length: 4

Purpose
Categorical

Distinct count: 16
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 859.4 KiB

Value               Count  Frequency (%)
Debt Consolidation  86430  78.6%
other               6598   6.0%
Home Improvements   6432   5.8%
Other               3558   3.2%
Business Loan       1732   1.6%
Buy a Car           1407   1.3%
Medical Bills       1240   1.1%
Buy House           748    0.7%
Take a Trip         617    0.6%
major_purchase      404    0.4%
Other values (6)    834    0.8%

Length

Max length: 20
Median length: 18
Mean length: 16.32622727
Min length: 5

Monthly Debt
Real number (ℝ≥0)

Distinct count: 69750
Unique (%): 63.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18468.526823363638
Minimum: 0.0
Maximum: 435843.28
Zeros: 82
Zeros (%): 0.1%
Memory size: 859.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3696.222
Q1: 10211.93
median: 16202.44
Q3: 23997.5225
95-th percentile: 40491.8975
Maximum: 435843.28
Range: 435843.28
Interquartile range (IQR): 13785.5925

Descriptive statistics

Standard deviation: 12195.56439
Coefficient of variation (CV): 0.6603431074
Kurtosis: 21.4810842
Mean: 18468.52682
Median Absolute Deviation (MAD): 6714.125
Skewness: 2.217005718
Sum: 2031537951
Variance: 148731790.8

Histogram with fixed size bins (bins=10)

Common values

Value                 Count   Frequency (%)
0                     82      0.1%
15903                 10      < 0.1%
10647.98              9       < 0.1%
18710.25              9       < 0.1%
14726.52              9       < 0.1%
15343.07              9       < 0.1%
12907.08              9       < 0.1%
11162.88              9       < 0.1%
13033.43              8       < 0.1%
12656.47              8       < 0.1%
Other values (69740)  109838  99.9%

Minimum 5 values

Value  Count  Frequency (%)
0      82     0.1%
7.41   2      < 0.1%
12.92  1      < 0.1%
17.1   1      < 0.1%
19.57  1      < 0.1%

Maximum 5 values

Value      Count  Frequency (%)
435843.28  1      < 0.1%
229057.92  2      < 0.1%
205801.35  1      < 0.1%
173265.56  2      < 0.1%
172156.15  1      < 0.1%
 

Years of Credit History
Real number (ℝ≥0)

Distinct count: 507
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.20248545454545
Minimum: 3.6
Maximum: 70.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 859.4 KiB

Quantile statistics

Minimum: 3.6
5-th percentile: 9
Q1: 13.5
median: 16.9
Q3: 21.7
95-th percentile: 31.7
Maximum: 70.5
Range: 66.9
Interquartile range (IQR): 8.2

Descriptive statistics

Standard deviation: 7.015575411
Coefficient of variation (CV): 0.3854185424
Kurtosis: 1.741236513
Mean: 18.20248545
Median Absolute Deviation (MAD): 4
Skewness: 1.071562695
Sum: 2002273.4
Variance: 49.21829835

Histogram with fixed size bins (bins=10)

Common values

Value               Count  Frequency (%)
16                  1481   1.3%
15                  1418   1.3%
17                  1346   1.2%
16.5                1297   1.2%
14                  1281   1.2%
15.4                1190   1.1%
13                  1137   1.0%
17.5                1114   1.0%
14.5                1084   1.0%
18                  1045   0.9%
Other values (497)  97607  88.7%

Minimum 5 values

Value  Count  Frequency (%)
3.6    1      < 0.1%
3.7    2      < 0.1%
3.8    4      < 0.1%
3.9    4      < 0.1%
4      7      < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
70.5   1      < 0.1%
65     2      < 0.1%
62.5   1      < 0.1%
60.5   2      < 0.1%
59.9   1      < 0.1%
 

Months since last delinquent
Real number (ℝ≥0)

MISSING

Distinct count: 116
Unique (%): 0.2%
Missing: 58447
Missing (%): 53.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.90708591158614
Minimum: 0.0
Maximum: 176.0
Zeros: 233
Zeros (%): 0.2%
Memory size: 859.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 16
median: 32
Q3: 51
95-th percentile: 75
Maximum: 176
Range: 176
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 21.96531522
Coefficient of variation (CV): 0.6292508997
Kurtosis: -0.7457595789
Mean: 34.90708591
Median Absolute Deviation (MAD): 17
Skewness: 0.4353747213
Sum: 1799565
Variance: 482.4750726

Histogram with fixed size bins (bins=10)

Common values

Value               Count  Frequency (%)
12                  994    0.9%
13                  991    0.9%
15                  974    0.9%
14                  944    0.9%
9                   943    0.9%
8                   943    0.9%
18                  931    0.8%
10                  930    0.8%
6                   921    0.8%
16                  921    0.8%
Other values (106)  42061  38.2%
(Missing)           58447  53.1%

Minimum 5 values

Value  Count  Frequency (%)
0      233    0.2%
1      308    0.3%
2      459    0.4%
3      491    0.4%
4      557    0.5%

Maximum 5 values

Value  Count  Frequency (%)
176    2      < 0.1%
152    1      < 0.1%
148    1      < 0.1%
143    1      < 0.1%
141    1      < 0.1%
 

Number of Open Accounts
Real number (ℝ≥0)

Distinct count: 52
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.124490909090909
Minimum: 0
Maximum: 76
Zeros: 2
Zeros (%): < 0.1%
Memory size: 859.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 8
median: 10
Q3: 14
95-th percentile: 20
Maximum: 76
Range: 76
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 5.01109345
Coefficient of variation (CV): 0.4504559796
Kurtosis: 3.055216122
Mean: 11.12449091
Median Absolute Deviation (MAD): 3
Skewness: 1.184541437
Sum: 1223694
Variance: 25.11105757

Histogram with fixed size bins (bins=10)

Common values

Value              Count  Frequency (%)
9                  10258  9.3%
10                 9980   9.1%
8                  9705   8.8%
11                 9447   8.6%
7                  8973   8.2%
12                 8142   7.4%
6                  7442   6.8%
13                 6826   6.2%
14                 5712   5.2%
5                  5197   4.7%
Other values (42)  28318  25.7%

Minimum 5 values

Value  Count  Frequency (%)
0      2      < 0.1%
1      27     < 0.1%
2      483    0.4%
3      1489   1.4%
4      3140   2.9%

Maximum 5 values

Value  Count  Frequency (%)
76     2      < 0.1%
56     2      < 0.1%
55     1      < 0.1%
52     2      < 0.1%
48     4      < 0.1%
 

Number of Credit Problems
Real number (ℝ≥0)

ZEROS

Distinct count: 14
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.16805454545454546
Minimum: 0
Maximum: 15
Zeros: 94688
Zeros (%): 86.1%
Memory size: 859.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 15
Range: 15
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4843888211
Coefficient of variation (CV): 2.882330971
Kurtosis: 49.49085709
Mean: 0.1680545455
Median Absolute Deviation (MAD): 0
Skewness: 4.912341009
Sum: 18486
Variance: 0.23463253

Histogram with fixed size bins (bins=10)

Common values

Value             Count  Frequency (%)
0                 94688  86.1%
1                 13235  12.0%
2                 1426   1.3%
3                 416    0.4%
4                 135    0.1%
5                 57     0.1%
6                 19     < 0.1%
7                 8      < 0.1%
9                 5      < 0.1%
8                 4      < 0.1%
Other values (4)  7      < 0.1%

Minimum 5 values

Value  Count  Frequency (%)
0      94688  86.1%
1      13235  12.0%
2      1426   1.3%
3      416    0.4%
4      135    0.1%

Maximum 5 values

Value  Count  Frequency (%)
15     1      < 0.1%
12     1      < 0.1%
11     2      < 0.1%
10     3      < 0.1%
9      5      < 0.1%
 

Current Credit Balance
Real number (ℝ≥0)

Distinct count: 33641
Unique (%): 30.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 294282.17156363634
Minimum: 0
Maximum: 32878968
Zeros: 627
Zeros (%): 0.6%
Memory size: 859.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 30399.05
Q1: 112195
median: 209665
Q3: 367483.75
95-th percentile: 760402.8
Maximum: 32878968
Range: 32878968
Interquartile range (IQR): 255288.75

Descriptive statistics

Standard deviation: 377277.2719
Coefficient of variation (CV): 1.282025581
Kurtosis: 673.3052374
Mean: 294282.1716
Median Absolute Deviation (MAD): 115140
Skewness: 14.22660366
Sum: 3.237103887e+10
Variance: 1.423381399e+11

Histogram with fixed size bins (bins=10)

Common values

Value                 Count   Frequency (%)
0                     627     0.6%
67697                 19      < 0.1%
100301                18      < 0.1%
137807                18      < 0.1%
90782                 17      < 0.1%
65683                 17      < 0.1%
175978                17      < 0.1%
88426                 17      < 0.1%
171836                16      < 0.1%
104747                16      < 0.1%
Other values (33631)  109218  99.3%

Minimum 5 values

Value  Count  Frequency (%)
0      627    0.6%
19     12     < 0.1%
38     10     < 0.1%
57     7      < 0.1%
76     5      < 0.1%

Maximum 5 values

Value     Count  Frequency (%)
32878968  1      < 0.1%
16237438  1      < 0.1%
12986956  2      < 0.1%
12746397  2      < 0.1%
11796435  2      < 0.1%
 

Maximum Open Credit
Real number (ℝ≥0)

SKEWED

Distinct count: 46468
Unique (%): 42.2%
Missing: 2
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 754101.205331006
Minimum: 0.0
Maximum: 1539737892.0
Zeros: 743
Zeros (%): 0.7%
Memory size: 859.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 109934
Q1: 273157.5
median: 467390
Q3: 783194.5
95-th percentile: 1640320
Maximum: 1539737892
Range: 1539737892
Interquartile range (IQR): 510037

Descriptive statistics

Standard deviation: 8014002.168
Coefficient of variation (CV): 10.62722366
Kurtosis: 22216.17809
Mean: 754101.2053
Median Absolute Deviation (MAD): 230758
Skewness: 138.1501683
Sum: 8.294962438e+10
Variance: 6.422423075e+13

Histogram with fixed size bins (bins=10)

Common values

Value                 Count   Frequency (%)
0                     743     0.7%
237204                14      < 0.1%
163064                13      < 0.1%
376992                13      < 0.1%
323928                13      < 0.1%
107404                13      < 0.1%
242968                13      < 0.1%
236412                13      < 0.1%
246136                13      < 0.1%
298804                12      < 0.1%
Other values (46458)  109138  99.2%

Minimum 5 values

Value  Count  Frequency (%)
0      743    0.7%
4334   4      < 0.1%
4444   1      < 0.1%
5390   2      < 0.1%
6446   5      < 0.1%

Maximum 5 values

Value       Count  Frequency (%)
1539737892  1      < 0.1%
1304726170  1      < 0.1%
980305260   1      < 0.1%
798255370   1      < 0.1%
632477736   1      < 0.1%
 

Bankruptcies
Real number (ℝ≥0)

ZEROS

Distinct count: 8
Unique (%): < 0.1%
Missing: 226
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.11761437134476288
Minimum: 0.0
Maximum: 7.0
Zeros: 97669
Zeros (%): 88.8%
Memory size: 859.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 7
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.3512897492
Coefficient of variation (CV): 2.98679273
Kurtosis: 18.2404431
Mean: 0.1176143713
Median Absolute Deviation (MAD): 0
Skewness: 3.495656707
Sum: 12911
Variance: 0.1234044879

Histogram with fixed size bins (bins=10)

Common values

Value      Count  Frequency (%)
0          97669  88.8%
1          11497  10.5%
2          463    0.4%
3          107    0.1%
4          27     < 0.1%
5          8      < 0.1%
6          2      < 0.1%
7          1      < 0.1%
(Missing)  226    0.2%

Minimum 5 values

Value  Count  Frequency (%)
0      97669  88.8%
1      11497  10.5%
2      463    0.4%
3      107    0.1%
4      27     < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
7      1      < 0.1%
6      2      < 0.1%
5      8      < 0.1%
4      27     < 0.1%
3      107    0.1%
 

Tax Liens
Real number (ℝ≥0)

ZEROS

Distinct count: 13
Unique (%): < 0.1%
Missing: 11
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.029448399385393086
Minimum: 0.0
Maximum: 15.0
Zeros: 107872
Zeros (%): 98.1%
Memory size: 859.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 15
Range: 15
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2610544079
Coefficient of variation (CV): 8.864808049
Kurtosis: 399.8374433
Mean: 0.02944839939
Median Absolute Deviation (MAD): 0
Skewness: 15.62301278
Sum: 3239
Variance: 0.06814940388

Histogram with fixed size bins (bins=10)

Common values

Value             Count   Frequency (%)
0                 107872  98.1%
1                 1474    1.3%
2                 405     0.4%
3                 124     0.1%
4                 66      0.1%
5                 18      < 0.1%
6                 12      < 0.1%
7                 7       < 0.1%
9                 5       < 0.1%
8                 2       < 0.1%
Other values (3)  4       < 0.1%
(Missing)         11      < 0.1%

Minimum 5 values

Value  Count   Frequency (%)
0      107872  98.1%
1      1474    1.3%
2      405     0.4%
3      124     0.1%
4      66      0.1%

Maximum 5 values

Value  Count  Frequency (%)
15     1      < 0.1%
11     2      < 0.1%
10     1      < 0.1%
9      5      < 0.1%
8      2      < 0.1%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
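
The calculation described above can be sketched in plain Python. This is an illustration of the formula only, not the code the report generator runs; the function name `pearson_r` and the sample data are made up for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2 * v + 1 for v in x]        # increasing linear function of x
y_down = [-0.5 * v + 10 for v in x]  # decreasing linear function of x

print(pearson_r(x, y_up))
print(pearson_r(x, y_down))
```

Both linear functions give a coefficient of magnitude 1 (up to floating-point error) regardless of their slope, which is the invariance mentioned above.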

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
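
A minimal sketch of that procedure: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. Assigning tied values the average of their positions is a common convention and is assumed here; the helper names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x

print(spearman_rho(x, y))
print(pearson_r(x, y))
```

For this monotonic, nonlinear relationship the rank-based coefficient is 1 while Pearson's r stays below 1.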

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
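
The sketch below follows that description literally, which corresponds to the variant usually called tau-a. Library implementations often apply a tie correction (tau-b), so results on data with ties may differ from what the report shows; the function name and sample data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    pairs = list(combinations(range(len(xs)), 2))
    concordant = discordant = 0
    for i, j in pairs:
        # Positive product: both variables order the pair the same way.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 means a tie in x or y; it counts toward neither group.
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair relative to x

print(kendall_tau_a(x, y))
```

Here five of the six pairs are concordant and one is discordant, giving (5 - 1) / 6.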

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
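
A minimal sketch of a bias-corrected Cramér's V computed from a contingency table of counts, using the correction terms commonly attributed to Bergsma (2013). It assumes every row and column total is non-zero; the function name and the example tables are illustrative, and this is not necessarily the exact code the report generator uses.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r = len(table)
    k = len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


independent = [[10, 20], [20, 40]]  # rows are proportional: no association
perfect = [[50, 0], [0, 50]]        # each row maps to exactly one column

print(cramers_v_corrected(independent))
print(cramers_v_corrected(perfect))
```

The proportional table yields 0 and the diagonal table yields 1 (up to floating-point error), matching the two ends of the range described above.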

Missing values

Sample

First rows

Loan IDCustomer IDLoan StatusCurrent Loan AmountTermCredit ScoreAnnual IncomeYears in current jobHome OwnershipPurposeMonthly DebtYears of Credit HistoryMonths since last delinquentNumber of Open AccountsNumber of Credit ProblemsCurrent Credit BalanceMaximum Open CreditBankruptciesTax Liens
014dd8831-6af5-400b-83ec-68e61888a048981165ec-3274-42f5-a3b4-d104041a9ca9Fully Paid445412Short Term709.01167493.08 yearsHome MortgageHome Improvements5214.7417.2NaN61228190416746.01.00.0
14771cc26-131a-45db-b5aa-537ea4ba53422de017a3-2e01-49cb-a581-08169e83be29Fully Paid262328Short TermNaNNaN10+ yearsHome MortgageDebt Consolidation33295.9821.18.0350229976850784.00.00.0
24eed4e6a-aa2f-4c91-8651-ce984ee8fb265efb2b2b-bf11-4dfd-a572-3761a2694725Fully Paid99999999Short Term741.02231892.08 yearsOwn HomeDebt Consolidation29200.5314.929.0181297996750090.00.00.0
377598f7b-32e7-4e3b-a6e5-06ba0d98fe8ae777faab-98ae-45af-9a86-7ce5b33b1011Fully Paid347666Long Term721.0806949.03 yearsOwn HomeDebt Consolidation8741.9012.0NaN90256329386958.00.00.0
4d4062e70-befa-4995-8643-a0de7393818281536ad9-5ccf-4eb8-befb-47a4d608658eFully Paid176220Short TermNaNNaN5 yearsRentDebt Consolidation20639.706.1NaN150253460427174.00.00.0
589d8cb0c-e5c2-4f54-b056-48a645c543dd4ffe99d3-7f2a-44db-afc1-40943f1f9750Charged Off206602Short Term7290.0896857.010+ yearsHome MortgageDebt Consolidation16367.7417.3NaN60215308272448.00.00.0
6273581de-85d8-4332-81a5-19b04ce6866690a75dde-34d5-419c-90dc-1e58b04b3e35Fully Paid217646Short Term730.01184194.0< 1 yearHome MortgageDebt Consolidation10855.0819.610.0131122170272052.01.00.0
7db0dc6e1-77ee-4826-acca-772f9039e1c7018973c9-e316-4956-b363-67e134fb0931Charged Off648714Long TermNaNNaN< 1 yearHome MortgageBuy House14806.138.28.0150193306864204.00.00.0
88af915d9-9e91-44a0-b5a2-564a45c12089af534dea-d27e-4fd6-9de8-efaa52a78ec0Fully Paid548746Short Term678.02559110.02 yearsRentDebt Consolidation18660.2822.633.040437171555038.00.00.0
90b1c4e3d-bd97-45ce-9622-22732fcdc9a0235c4a43-dadf-483d-aa44-9d6d77ae4583Fully Paid215952Short Term739.01454735.0< 1 yearRentDebt Consolidation39277.7513.9NaN2006695601021460.00.00.0

Last rows

Loan IDCustomer IDLoan StatusCurrent Loan AmountTermCredit ScoreAnnual IncomeYears in current jobHome OwnershipPurposeMonthly DebtYears of Credit HistoryMonths since last delinquentNumber of Open AccountsNumber of Credit ProblemsCurrent Credit BalanceMaximum Open CreditBankruptciesTax Liens
10999091e26ead-810b-44a0-892f-d623e1e444a01e7cbdfe-d3a9-41a4-8416-3a23b08ef27bFully Paid260128Long Term611.03182519.04 yearsHome MortgageDebt Consolidation79563.0720.740.047013989323673032.00.00.0
1099910594e01c-4230-4280-8267-78cd0c46f7200c7d45a9-ff22-489c-82c0-073d74c46fd0Fully Paid352704Long Term686.0913824.03 yearsRentDebt Consolidation9595.199.0NaN50299592346214.00.00.0
1099924d0d0e65-e9aa-43e4-8de0-8aa2869a9983add5361a-e612-4c58-bd7d-414543250ebcFully Paid428604Long Term697.02183043.06 yearsHome MortgageDebt Consolidation16882.0722.1NaN100385187525316.00.00.0
1099936f0b1e02-d222-4227-9161-b0c4fff4dd76879d5bf4-6597-4f2b-ae7c-9deb68537c88Charged Off220858Short Term737.01234088.0< 1 yearRentDebt Consolidation20362.499.5NaN100273353409442.00.00.0
109994a2701102-3cb0-46a6-8658-e6f20d9501826dc5fa63-93f4-43a9-8192-2df57548287bFully Paid442596Short Term739.01528968.07 yearsHome MortgageDebt Consolidation19494.3816.779.0110419235656876.00.00.0
109995c4ab66f9-833c-43b8-879c-4f8bcb64dd148ee2002b-8fb6-4af0-ab74-25a1c23e7647Charged Off157806Short Term731.01514376.06 yearsRentDebt Consolidation4795.4112.5NaN9087058234410.00.00.0
109996bbd3a392-01b4-4e0e-9c28-b2a4a39beac76c306306-f5c2-4db5-b74a-af2895123ecbFully Paid132550Short Term718.0763192.04 yearsHome MortgageDebt Consolidation12401.879.920.08074309329692.00.00.0
109997da9870de-4280-46a3-8fc6-91cfe5bfde9dcc94e25e-1060-4465-b603-194e122f0239Fully Paid223212Long TermNaNNaNNaNRentDebt Consolidation4354.4227.2NaN8199636568370.01.00.0
1099980cc8e0e0-1bc6-49d7-ad0f-0598b647458ff90cf410-a34b-49e7-8af9-2b405e17b827Fully Paid99999999Short Term721.0972097.010+ yearsHome MortgageDebt Consolidation12232.2016.824.081184984240658.00.00.0
10999914f94b64-26c4-48fd-b916-1388d7adcc1df1838fa9-7ad9-44d5-97a6-7a6d3f3529d7Fully Paid99999999Short Term748.01079960.06 yearsHome MortgageDebt Consolidation12239.6119.7NaN140179018607882.00.00.0

Duplicate rows

Most frequent

Loan IDCustomer IDLoan StatusCurrent Loan AmountTermCredit ScoreAnnual IncomeYears in current jobHome OwnershipPurposeMonthly DebtYears of Credit HistoryMonths since last delinquentNumber of Open AccountsNumber of Credit ProblemsCurrent Credit BalanceMaximum Open CreditBankruptciesTax Lienscount
00018f629-8cef-48bd-bb93-40179f24256c1e96933c-3a01-46b2-975d-06c5a2b469c3Fully Paid66396Short Term711.0535192.03 yearsRentDebt Consolidation9142.8015.850.080112347307538.00.00.02
1001a84a9-3fd5-4e82-9153-49325b996408b282e6f9-2d09-4988-b579-6d90d104e70dFully Paid180246Long Term658.0858097.02 yearsRentOther10289.8318.458.080288553440220.00.00.02
2002f45ea-0555-4498-b56d-d1dffd3e4e0512b867fc-b39e-48e1-9cd6-ff89a5352583Fully Paid314006Short Term720.01259092.010+ yearsHome MortgageDebt Consolidation6746.5229.018.0100181374443982.00.00.02
3003665f5-adff-4fbb-903c-14d022fa6a0880c0ee25-ec87-4e1f-aefc-e5bbd679cef1Fully Paid43692Short Term719.0679079.09 yearsHaveMortgageTake a Trip8601.6811.069.05516264174328.01.04.02
4003d60c3-3a4c-4b81-b0f4-3105c34ce2e89d1e355e-f80d-40a8-b582-d391baa0dce9Fully Paid194370Short Term714.0991952.010+ yearsHome Mortgageother11242.1121.032.060168017448272.00.00.02
50046bac8-1053-4385-8ffe-93dcebd8ee7d4c8dd1fd-1831-4d32-a6e4-cbe4702b2020Fully Paid515064Long Term724.01457851.08 yearsHome MortgageHome Improvements6900.4228.549.011052288535854.00.00.02
600494add-bf6e-4f20-aaa2-03199d2356eede0ad0f6-aae0-4b65-befa-c8ce550aaa15Fully Paid262020Long Term716.01791472.09 yearsHome MortgageDebt Consolidation3866.5022.746.070109744128898.00.00.02
70067ba84-881c-4622-a1e5-f9e8cf171844f05a8b58-791f-4658-bbee-deeedc0c0cd9Fully Paid224488Short Term724.0969380.04 yearsHome MortgageDebt Consolidation16075.7117.936.070118959170676.00.00.02
8006c5c9c-bd62-49f5-bfd4-b0cd69af7a83613fb46d-39ec-49bb-9e8d-34e352a5adc0Fully Paid778866Long Term724.02498405.010+ yearsHome MortgageDebt Consolidation27690.6018.466.060655557918984.00.00.02
90076c53b-8cba-4e9f-a432-13ba69c28fcc719e6f4d-a191-402d-83be-ed0e11c86005Fully Paid197560Short Term711.0817456.02 yearsOwn HomeDebt Consolidation8719.6711.176.0404769080278.00.00.02